National Repository of Grey Literature: 24 records found (records 1 - 10 shown)
Emotion Recognition from Acted and Spontaneous Speech
Atassi, Hicham ; Přibil, Jiří (referee) ; Zahradník, Pavel (referee) ; Smékal, Zdeněk (advisor)
The dissertation deals with recognizing the emotional state of speakers from the speech signal. The thesis is divided into two main parts. The first part describes the proposed methods for emotion recognition from acted databases and presents recognition results obtained on two different databases in different languages. The main contributions of this part are a detailed analysis of a wide range of features extracted from the speech signal, the design of new classification architectures such as "emotion pairing", and the design of a new method for mapping discrete emotional states into a two-dimensional space. The second part deals with emotion recognition from a database of spontaneous speech obtained from recordings of calls in real call centers. Findings from the analysis and design of recognition methods for acted speech were used to design a new system for recognizing seven spontaneous emotional states. The core of the proposed approach is a complex classification architecture based on the fusion of different systems. The thesis further examines the influence of the speaker's emotional state on the accuracy of gender recognition and proposes a system for automatic detection of successful calls in call centers based on analysis of the parameters of the dialogue between the participants of telephone calls.
Estimation of Fundamental Speech Frequency
Ráček, Tomáš ; Vlach, Jan (referee) ; Vondra, Martin (advisor)
The Bachelor thesis focuses on algorithms for estimating the fundamental frequency of speech. The first part introduces the topic of speech signals and outlines the core of the thesis. The second part explains the nature of the speech signal, how a person produces it, and models for speech generation. Chapter 3 describes the processing of acoustic signals, including pre-processing, segmentation and the application of a Hamming window to the speech signal. The next chapter treats the fundamental (pitch) frequency of speech as a physical quantity and its derivation from the pitch period. It further describes how the fundamental frequency arises in the speech organs, its range for different people, the properties it carries and, finally, its possible uses. Chapter 5 deals with the essential principles for determining the pitch frequency in the time, frequency and cepstral domains. Chapter 6 describes principles used when the speech signal is degraded by noise. In the next chapter the author describes the design and implementation of a selected principle, presents the results achieved with it, and compares them with the results of the ordinary autocorrelation principle. The final chapter summarises the thesis and discusses possible extensions or improvements of the algorithm.
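As an illustration of the "ordinary autocorrelation principle" that the abstract uses as its baseline, the following is a minimal sketch and not the thesis's implementation. The function name, the search range of 60 to 400 Hz and the synthetic test tone are assumptions made for this example.

```python
import math


def estimate_f0_autocorr(frame, sample_rate, f_min=60.0, f_max=400.0):
    """Estimate the fundamental frequency (Hz) of one frame by autocorrelation.

    f_min and f_max are assumed search limits, not values from the thesis.
    Returns None when no positive autocorrelation peak is found.
    """
    n = len(frame)
    mean = sum(frame) / n
    x = [s - mean for s in frame]               # remove the DC component
    lag_min = max(1, int(sample_rate / f_max))  # shortest pitch period tested
    lag_max = min(int(sample_rate / f_min), n - 1)
    best_lag, best_value = 0, 0.0
    for lag in range(lag_min, lag_max + 1):
        # Unnormalised autocorrelation at this lag.
        r = sum(x[i] * x[i + lag] for i in range(n - lag))
        if r > best_value:
            best_lag, best_value = lag, r
    if best_lag == 0:
        return None
    return sample_rate / best_lag


# Synthetic 200 Hz tone sampled at 8 kHz (hypothetical test signal).
sample_rate = 8000
tone = [math.sin(2 * math.pi * 200.0 * i / sample_rate) for i in range(400)]
f0 = estimate_f0_autocorr(tone, sample_rate)
```

The lag with the strongest autocorrelation is taken as the pitch period, so the resolution is limited to whole samples; practical estimators usually add interpolation and a voicing decision.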
Assessment of speech signal quality
Tuleja, Peter ; Balík, Miroslav (referee) ; Míča, Ivan (advisor)
This paper discusses methods for evaluating the quality of the speech signal. It briefly describes subjective methods for determining speech signal quality, of which pair-wise comparison and the MOS score are presented. Objective intrusive methods are described in more detail, namely segmental SNR evaluated in the time domain, segmental SNR evaluated in the frequency domain, and a frame normalization method that uses an LSE-based estimator. The paper closes with an experiment in which the aforementioned methods are compared and then statistically evaluated.
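The time-domain segmental SNR named in the abstract can be sketched as below. This is an illustrative version only: the frame length and the clamping of per-frame values to the range -10 dB to 35 dB are common conventions assumed here, not details taken from the paper.

```python
import math


def segmental_snr(clean, degraded, frame_len=160, min_db=-10.0, max_db=35.0):
    """Mean per-frame SNR in dB between a clean and a degraded signal.

    Both signals must be time-aligned. frame_len, min_db and max_db are
    assumed defaults for this sketch.
    """
    length = min(len(clean), len(degraded))
    frame_snrs = []
    for start in range(0, length - frame_len + 1, frame_len):
        c = clean[start:start + frame_len]
        d = degraded[start:start + frame_len]
        signal_energy = sum(s * s for s in c)
        noise_energy = sum((s - t) ** 2 for s, t in zip(c, d))
        if signal_energy == 0.0:
            continue                      # skip frames with no reference signal
        if noise_energy == 0.0:
            snr_db = max_db               # identical frames: use the upper clamp
        else:
            snr_db = 10.0 * math.log10(signal_energy / noise_energy)
        frame_snrs.append(max(min_db, min(max_db, snr_db)))
    if not frame_snrs:
        return None
    return sum(frame_snrs) / len(frame_snrs)


# Hypothetical example: the degraded signal is the clean one scaled by 1.1,
# so the error is 10 % of the amplitude in every frame.
clean = [math.sin(0.3 * i) for i in range(800)]
degraded = [1.1 * s for s in clean]
snr = segmental_snr(clean, degraded)
```

The method is intrusive because it needs the clean reference; averaging per-frame values in dB keeps loud frames from dominating the score.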
Room Impulse Response Estimation from Speech Signal
Gregor, Adam ; Szőke, Igor (referee) ; Černocký, Jan (advisor)
Any sound propagating through a room is distorted by the impulse response of that room. Measuring these impulse responses has always been an important task in acoustics, and it has recently gained further importance because impulse responses can be used in data augmentation for training automatic speech recognizers. The room impulse response is normally measured using both the clean and the distorted form of an audio signal. This is impractical in many real settings (for example home assistants or smart homes), where only the distorted signal is available. This bachelor thesis deals with "blind" estimation of the impulse response, using only the distorted speech signal. First, using the BUT ReverbDB dataset, we re-implemented standard techniques for measuring the impulse response from the clean and distorted signals. We then tested two techniques that estimate the room impulse response from distorted speech alone. The first technique uses impulsive phonemes in the distorted speech, which are assumed to resemble room impulse responses. Averaging and deconvolution of these phonemes were tested in order to increase the quality and robustness of the estimate. The second technique uses regression neural networks that generate room impulse responses from input speech. Although neither of the proposed techniques produces estimates on the level of standard measurements, the estimates have potential in data augmentation for training automatic speech recognizers.
Simple text-independent voice lock - speaker verification software system
Kotulek, Milan ; Dolenský, Jan (referee) ; Staněk, Miroslav (advisor)
This thesis gives a brief introduction to biometrics, leading to the description and design of a verification system based on speech analysis. The designed system first performs basic signal processing and then recognizes vowels in fluent Czech speech. For each vowel found, the observed speech features are calculated. The resulting GUI application was tested on a purpose-built speaker database; its efficiency is approximately 54 % for short test utterances and approximately 88 % for long test utterances.
Application of statistical analysis of speech in patients with Parkinson's disease
Bijota, Jan ; Mžourek, Zdeněk (referee) ; Galáž, Zoltán (advisor)
This thesis deals with speech analysis of people who suffer from Parkinson's disease. Its purpose is to obtain a statistical sample of speech parameters that helps determine whether an examined person suffers from Parkinson's disease. The statistical sample is based on the detection of hypokinetic dysarthria. The speech signal is pre-processed by DC-offset removal and pre-emphasis and is then divided into frames. Phonation parameters, MFCC and PLP coefficients are used to characterize the framed speech signal. After parametrization, the speech signal can be analyzed by statistical methods; this thesis uses Spearman's and Pearson's correlation coefficients, mutual information, the Mann-Whitney U test and Student's t-test. The results are groups of speech parameters, for individual long Czech vowels, that best indicate the difference between a healthy person and a patient suffering from Parkinson's disease. These results can be helpful in the medical diagnosis of a patient.
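The pre-processing chain named in the abstract (DC-offset removal, pre-emphasis, framing) can be sketched as follows. The helper names, the pre-emphasis coefficient of 0.97 and the frame sizes are assumptions for illustration; the abstract does not state the values the thesis used.

```python
def remove_dc_offset(signal):
    """Subtract the mean so the signal is centred on zero."""
    mean = sum(signal) / len(signal)
    return [s - mean for s in signal]


def pre_emphasis(signal, alpha=0.97):
    """First-order high-pass filter y[n] = x[n] - alpha * x[n-1].

    alpha = 0.97 is a commonly used value, assumed here.
    """
    return [signal[0]] + [
        signal[i] - alpha * signal[i - 1] for i in range(1, len(signal))
    ]


def split_into_frames(signal, frame_len, hop):
    """Cut the signal into overlapping frames; a trailing remainder is dropped."""
    return [
        signal[start:start + frame_len]
        for start in range(0, len(signal) - frame_len + 1, hop)
    ]


# Hypothetical usage on a short ramp signal.
raw = [float(i) for i in range(10)]
frames = split_into_frames(pre_emphasis(remove_dc_offset(raw)), frame_len=4, hop=2)
```

Each frame would then be passed to the parametrization step (phonation parameters, MFCC, PLP), which is outside the scope of this sketch.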
Pause Identification in Degraded Speech Signal
Podloucká, Lenka ; Balík, Miroslav (referee) ; Smékal, Zdeněk (advisor)
This diploma thesis deals with pause identification in degraded speech signals. It describes the characteristics of speech and the concept of speech signal processing. The aim of the work was to create a reliable method for distinguishing speech and non-speech segments in both clean and degraded speech signals. Five empty-pause detectors were implemented in the MATLAB computing environment: an energy detector in the time domain, a two-step detector in the spectral domain, and a one-step integral detector, a two-step integral detector and a differential detector in the cepstral domain. The spectral detector uses the energy characteristics of the speech signal in the first step and statistical analysis in the second step. The cepstral detectors use integral or differential algorithms. The robustness of the detectors was tested for different types of speech degradation and different values of the signal-to-noise ratio (SNR). The influence of the different types of degradation on non-speech detection was compared across detectors using ROC (Receiver Operating Characteristic) curves.
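The simplest of the five detectors, the time-domain energy detector, can be illustrated with the sketch below. It is written in Python rather than the MATLAB used in the thesis, and the relative threshold of 10 % of the maximum frame energy is an assumption for this example, not the thesis's decision rule.

```python
def detect_pauses(signal, frame_len, threshold_ratio=0.1):
    """Mark each non-overlapping frame as pause (True) or speech (False).

    A frame counts as a pause when its mean energy is below
    threshold_ratio times the largest frame energy (assumed rule).
    """
    energies = [
        sum(s * s for s in signal[start:start + frame_len]) / frame_len
        for start in range(0, len(signal) - frame_len + 1, frame_len)
    ]
    if not energies:
        return []
    peak = max(energies)
    if peak == 0.0:
        return [True] * len(energies)     # a completely silent signal
    threshold = threshold_ratio * peak
    return [energy < threshold for energy in energies]


# Hypothetical signal: silence, a burst of activity, silence.
signal = [0.0] * 100 + [1.0, -1.0] * 50 + [0.0] * 100
pause_flags = detect_pauses(signal, frame_len=50)
```

A fixed relative threshold breaks down quickly as noise is added, which is consistent with the thesis also testing spectral and cepstral detectors at different SNR values.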
Django framework based web application for objective analysis of hypokinetic dysarthria
Čapek, Karel ; Zvončák, Vojtěch (referee) ; Galáž, Zoltán (advisor)
This master's thesis deals with the calculation of parameters able to differentiate healthy speech from speech impaired by hypokinetic dysarthria. Hypokinetic dysarthria, a motor disorder of speech and the vocal tract, was studied first, followed by speech signal processing methods. Parameters that could reliably differentiate healthy and impaired speech were then studied and subsequently implemented in the Python programming language. The final step was to create a web application in the Django framework for the analysis of dysarthric speech.
Software analysator and tuner of vocal records
Smatana, Tomáš ; Dolenský, Jan (referee) ; Staněk, Miroslav (advisor)
This thesis deals with methods for detecting the fundamental frequency and for changing the fundamental frequency of an audio signal containing vocals. It also explores general musical intonation theory. On the basis of this analysis, suitable methods are selected for the subsequent implementation of software for fine-tuning vocal audio signals.
Calculation of speech rate
Galáž, Zoltán ; Smékal, Zdeněk (referee) ; Mekyska, Jiří (advisor)
This semestral thesis deals with the design of a system for calculating the rate of speech. The system consists of several blocks: a block for signal pre-processing and segmentation into smaller parts, a feature calculation block, a feature vector quantization block and, finally, a block that calculates the actual rate. The first step converts the input speech signal into a form suitable for feature extraction. In the next step the features are assigned to the calculated centroids. A change of centroid indicates a change of phoneme. The system records the boundaries of the phonemes contained in the speech and calculates its rate.
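The last two blocks of this pipeline can be sketched under a simplifying assumption: each frame has already been assigned the index of its nearest centroid, and the rate is reported as detected boundaries per second. The function names and the 10 ms frame hop are hypothetical, and the abstract does not specify how the thesis converts boundaries into a rate.

```python
def count_boundaries(centroid_ids):
    """Count positions where consecutive frames map to different centroids."""
    return sum(
        1 for prev, cur in zip(centroid_ids, centroid_ids[1:]) if prev != cur
    )


def boundary_rate(centroid_ids, hop_seconds):
    """Boundaries per second, used here as a rough proxy for speech rate."""
    duration = len(centroid_ids) * hop_seconds
    if duration == 0:
        return 0.0
    return count_boundaries(centroid_ids) / duration


# Hypothetical per-frame centroid indices from vector quantization.
ids = [0, 0, 1, 1, 1, 2, 2, 0]
n_boundaries = count_boundaries(ids)
rate = boundary_rate(ids, hop_seconds=0.01)
```

In practice the raw index sequence tends to flicker between neighbouring centroids, so some smoothing or a minimum segment duration would likely be needed before counting.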
